[Experimental WIP] feat: implement extended containers (ADR 001) to lift 13-bit size limit #10

Merged
quickwritereader merged 3 commits into quickwritereader:experimental from heartical:feature/extended-containers
Apr 9, 2026
Conversation

@heartical
Contributor

@heartical heartical commented Mar 23, 2026

feat: implement extended containers (ADR 001) and implicit array encoding (ADR 002)

ADR 001 - Extended Containers:

  • Add TypeExtendedTagContainer (2) with 32-bit offsets (SelfOffset + Continuation)
  • Implement ExtendedPutAccess with automatic segmentation (default 4KB pivot)
  • Implement ExtendedGetAccess for transparent chain reading
  • Add high-level pack functions: PackExtended, PackExtendedWithMapStr, etc.
  • Full integration with existing PackOS types
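The automatic segmentation at a pivot size can be pictured with a small sketch. This is not the PR's `ExtendedPutAccess` code; `splitAtPivot` and `defaultPivot` are hypothetical names, and the only detail taken from the description is the 4KB default pivot.

```go
package main

import "fmt"

// defaultPivot mirrors the 4KB default pivot mentioned in the description.
const defaultPivot = 4096

// splitAtPivot (hypothetical helper) cuts payload into consecutive chunks
// of at most pivot bytes. The chunks alias the input; nothing is copied.
func splitAtPivot(payload []byte, pivot int) [][]byte {
	if pivot <= 0 {
		pivot = defaultPivot
	}
	var segments [][]byte
	for len(payload) > pivot {
		// The three-index slice caps capacity so that appending to one
		// chunk can never overwrite the next one.
		segments = append(segments, payload[:pivot:pivot])
		payload = payload[pivot:]
	}
	return append(segments, payload)
}

func main() {
	segs := splitAtPivot(make([]byte, 10000), defaultPivot)
	for i, s := range segs {
		fmt.Printf("segment %d: %d bytes\n", i, len(s))
	}
}
```

A 10000-byte payload splits into chunks of 4096, 4096, and 1808 bytes. The real implementation also has to write headers and link the segments, which this sketch leaves out.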

ADR 002 - Length-Agnostic Implicit Array Encoding:

  • Arrays detected when payload > 8 bytes
  • First byte indicates element size (1,2,4,8)
  • Element count calculated as (payloadSize - 1) / elementSize
  • Support int8/int16/int32/int64 and float32/float64 arrays
  • Add PutAccess.AddIntegerArray/AddFloatArray
  • Add PackInt64Array/PackFloat64Array high-level functions
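A minimal sketch of the decoding rule listed above, assuming little-endian element bytes (the description does not state the byte order). `encodeInt32Array` and `decodeIntArray` are hypothetical helpers, not the PR's `AddIntegerArray` or `PackInt64Array`, and the divisibility check is an added safeguard.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeInt32Array writes a size byte (4) followed by the elements.
// Little-endian order is an assumption of this sketch.
func encodeInt32Array(vals []int32) []byte {
	out := make([]byte, 1+4*len(vals))
	out[0] = 4
	for i, v := range vals {
		binary.LittleEndian.PutUint32(out[1+4*i:], uint32(v))
	}
	return out
}

// decodeIntArray applies the rule from the description: a payload longer
// than 8 bytes is an array, the first byte is the element size, and the
// element count is (payloadSize - 1) / elementSize.
func decodeIntArray(payload []byte) ([]int64, error) {
	if len(payload) <= 8 {
		return nil, errors.New("payload of 8 bytes or fewer is a scalar")
	}
	size := int(payload[0])
	if size != 1 && size != 2 && size != 4 && size != 8 {
		return nil, fmt.Errorf("unsupported element size %d", size)
	}
	body := payload[1:]
	if len(body)%size != 0 {
		return nil, errors.New("payload is not a whole number of elements")
	}
	out := make([]int64, len(body)/size)
	for i := range out {
		chunk := body[i*size : (i+1)*size]
		switch size {
		case 1:
			out[i] = int64(int8(chunk[0]))
		case 2:
			out[i] = int64(int16(binary.LittleEndian.Uint16(chunk)))
		case 4:
			out[i] = int64(int32(binary.LittleEndian.Uint32(chunk)))
		case 8:
			out[i] = int64(binary.LittleEndian.Uint64(chunk))
		}
	}
	return out, nil
}

func main() {
	payload := encodeInt32Array([]int32{1, -2, 3}) // 1 + 3*4 = 13 bytes
	vals, err := decodeIntArray(payload)
	fmt.Println(vals, err)
}
```

Under this rule a single int32 element produces a 5-byte payload, which falls below the 8-byte threshold and would be read as a scalar. The description does not say how the PR handles such short arrays.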

Both ADRs are fully integrated:

  • Arrays >4KB automatically use extended containers
  • Complete test coverage for all scenarios
  • Backward compatible with existing scalar values

Closes ADR-001
Closes ADR-002

- Add TypeExtendedTagContainer (2) with 32-bit offsets
- Implement ExtendedPutAccess with automatic segmentation (default 4KB pivot)
- Implement ExtendedGetAccess for transparent chain reading
- Add high-level pack functions: PackExtended, PackExtendedWithMapStr, etc.
- Support int/uint types in generic packing
- Full test coverage for encoding/decoding and edge cases

Closes ADR-001
…ding (ADR 002)

@quickwritereader
Owner

…adding ExtendedContainer and ExtendedReader types with triplet tracking and BFS/DFS access.
@heartical
Contributor Author

Key Features Implemented:

  • Triplet tracking: a Triplet struct records the parent segment, the nextOffset address, and the actual segment, so that relationships between nested containers are maintained when they are automatically promoted to extended containers.

  • BFS/DFS-style element access: NextSegment provides horizontal (BFS) traversal, PushSegment and PopSegment provide nested (DFS) traversal, and GetBytes provides BFS access across all segments.

  • Automatic promotion to extended containers: nested containers automatically become extended when their size exceeds pivotSize, and large data automatically creates extended containers with the corresponding header updates.

  • Continuation addresses: extended containers use 32-bit offsets, lifting the 13-bit limit; segments are chained through the SelfOffset and Continuation fields, and an EndOfChain marker identifies the final segment.

  • Simplified API: a single ExtendedContainer type replaces ExtendedPutAccess and ExtendedGetAccess, method names are consistent (AddInt16 and AddString instead of AddInt16Extended), and ExtendedReader provides a unified reading interface.
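The chain linking described above can be sketched as follows. This is an illustration, not the PR's implementation: the little-endian layout, the choice of 0 as the EndOfChain value, and the names `chainHeader`, `putHeader`, `readHeader`, and `walkChain` are all assumptions. Only the two 32-bit fields (8 bytes per segment header) come from the comment.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	headerSize = 8 // two 32-bit fields: SelfOffset + Continuation
	endOfChain = 0 // assumed marker value for the final segment
)

type chainHeader struct {
	SelfOffset   uint32
	Continuation uint32
}

// putHeader writes h at its own SelfOffset inside buf.
func putHeader(buf []byte, h chainHeader) {
	binary.LittleEndian.PutUint32(buf[h.SelfOffset:], h.SelfOffset)
	binary.LittleEndian.PutUint32(buf[h.SelfOffset+4:], h.Continuation)
}

func readHeader(buf []byte, at uint32) (chainHeader, error) {
	if uint64(at)+headerSize > uint64(len(buf)) {
		return chainHeader{}, errors.New("header out of bounds")
	}
	return chainHeader{
		SelfOffset:   binary.LittleEndian.Uint32(buf[at:]),
		Continuation: binary.LittleEndian.Uint32(buf[at+4:]),
	}, nil
}

// walkChain follows Continuation links from start and returns the offset
// of every segment visited, stopping at the end-of-chain marker.
func walkChain(buf []byte, start uint32) ([]uint32, error) {
	var visited []uint32
	at := start
	for {
		h, err := readHeader(buf, at)
		if err != nil {
			return nil, err
		}
		if h.SelfOffset != at {
			return nil, fmt.Errorf("SelfOffset %d does not match position %d", h.SelfOffset, at)
		}
		visited = append(visited, at)
		if h.Continuation == endOfChain {
			return visited, nil
		}
		if h.Continuation <= at {
			// Forward-only links rule out cycles in this sketch.
			return nil, errors.New("continuation must point forward")
		}
		at = h.Continuation
	}
}

func main() {
	buf := make([]byte, 48)
	putHeader(buf, chainHeader{SelfOffset: 0, Continuation: 16})
	putHeader(buf, chainHeader{SelfOffset: 16, Continuation: 32})
	putHeader(buf, chainHeader{SelfOffset: 32, Continuation: endOfChain})
	fmt.Println(walkChain(buf, 0))
}
```

Checking SelfOffset against the current position gives the reader a cheap consistency test on every hop, which is one plausible reason to store it alongside Continuation.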

Technical Details:

  • ExtendedContainer struct contains segments slice, triplets slice, current PutAccess pointer, pivotSize int threshold, isExtended bool flag, parent ExtendedContainer pointer, and parentOffsetAddr int

  • Triplet struct contains ParentSegment byte slice, NextOffsetAddr int, ActualSegment byte slice, IsExtended bool, SelfOffset uint32, and Continuation uint32

  • ExtendedReader struct contains segments 2D byte slice, currentSeg int index, segmentStack slice of ExtendedReader pointers, and getAccess GetAccess pointer
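Written out as Go declarations, the three structs described above might look like the sketch below. The field names follow the comment; the element type of `segments` in `ExtendedContainer`, the empty `PutAccess`/`GetAccess` stand-ins, and the `NextSegment` signature are assumptions made so the example compiles on its own.

```go
package main

import "fmt"

// PutAccess and GetAccess are empty stand-ins for the existing PackOS types.
type PutAccess struct{}
type GetAccess struct{}

// Triplet links a parent segment to the segment it continues into.
type Triplet struct {
	ParentSegment  []byte
	NextOffsetAddr int
	ActualSegment  []byte
	IsExtended     bool
	SelfOffset     uint32
	Continuation   uint32
}

type ExtendedContainer struct {
	segments         [][]byte // element type assumed
	triplets         []Triplet
	current          *PutAccess
	pivotSize        int
	isExtended       bool
	parent           *ExtendedContainer
	parentOffsetAddr int
}

type ExtendedReader struct {
	segments     [][]byte
	currentSeg   int
	segmentStack []*ExtendedReader
	getAccess    *GetAccess
}

// NextSegment (hypothetical signature) advances horizontally to the next
// segment and reports whether one was available.
func (r *ExtendedReader) NextSegment() ([]byte, bool) {
	if r.currentSeg+1 >= len(r.segments) {
		return nil, false
	}
	r.currentSeg++
	return r.segments[r.currentSeg], true
}

func main() {
	r := &ExtendedReader{segments: [][]byte{[]byte("head"), []byte("tail")}}
	seg, ok := r.NextSegment()
	fmt.Println(string(seg), ok)
	_, ok = r.NextSegment()
	fmt.Println(ok)
}
```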

Also included: buffer pooling via the existing PutAccess and GetAccess sync.Pool, zero-copy operations where possible, and automatic segmentation to optimize memory usage. Extended containers add an 8-byte header overhead per segment.

@heartical heartical closed this Apr 9, 2026
@heartical heartical reopened this Apr 9, 2026
@quickwritereader quickwritereader changed the base branch from main to experimental April 9, 2026 19:11
@quickwritereader quickwritereader added the enhancement New feature or request label Apr 9, 2026
@quickwritereader quickwritereader changed the title feat: implement extended containers (ADR 001) to lift 13-bit size limit [Experimental WIP] feat: implement extended containers (ADR 001) to lift 13-bit size limit Apr 9, 2026
@quickwritereader quickwritereader merged commit 740c41a into quickwritereader:experimental Apr 9, 2026
1 check passed
@quickwritereader
Owner

@heartical bro, I merged it, but it will not be propagated to main in its current form.
Could you redo it the way I asked? First we should write the encoding side, then decoding, and so on.

@quickwritereader
Owner

quickwritereader commented Apr 9, 2026

@heartical
For example, say we choose to omit the first chunk inside.
This should be a simple test for a string (for now, as a demo, it treats a 16-byte string as big).

type ExtendedContainerValue struct {
    NextSegmentOffset uint32
    // For now, no additional info or self-check: we can always check the size
    // of the extended container to decide whether it contains a chunk.
}

func TestExtendedContainerPack(t *testing.T) {
    put := NewExtendedPutAccess(16)
    put.AddInt16(42)
    put.AddBool(true)
    put.AddString(strings.Repeat("A", 16)) // 16-byte string triggers extended container
    put.AddBytes([]byte{0xAA, 0xBB})

    actual, err := put.PackExtended()
    require.NoError(t, err)

    // --- First segment: headers + primitive payload + ExtendedContainer marker ---
    expectedFirstSegment := []byte{
        0x51, 0x00, // TypeInt16
        0x15, 0x00, // TypeBool
        0x22, 0x00, // ExtendedContainer
        0x2E, 0x00, // TypeBytes
        0x38, 0x00, // TypeEnd

        0x2A, 0x00,             // int16(42)
        0x01,                   // bool(true)
        0x00, 0x00, 0x00, 0x0D, // nextDataOffset
        0xAA, 0xBB,             // bytes payload
    }

    if !bytes.Equal(actual[:len(expectedFirstSegment)], expectedFirstSegment) {
        t.Errorf("First segment mismatch.\nExpected: %v\nActual:   %v",
            expectedFirstSegment, actual[:len(expectedFirstSegment)])
    }

    // --- Second segment: ExtendedContainer with chunk ---
    // Expected layout:
    // 22 00 (ExtendedContainer)
    // C0 00 (TypeEnd, offset 24)
    // 00 00 00 00 (NextSegmentOffset = 0, last segment)
    // 26 00 (TypeString)
    // 80 00 (TypeEnd)
    // 16 bytes of 'A'
    expectedSecondSegment := []byte{
        0x22, 0x00, // ExtendedContainer; its length will be 24, which tells us it has a chunk inside
        0xC0, 0x00, // TypeEnd (offset 24)
        0x00, 0x00, 0x00, 0x00, // NextSegmentOffset
        0x26, 0x00, // TypeString
        0x80, 0x00, // TypeEnd
        // 16 bytes of 'A'
        'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
        'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
    }

    segStart := len(expectedFirstSegment)
    if !bytes.Equal(actual[segStart:segStart+len(expectedSecondSegment)], expectedSecondSegment) {
        t.Errorf("Second segment mismatch.\nExpected: %v\nActual:   %v",
            expectedSecondSegment, actual[segStart:segStart+len(expectedSecondSegment)])
    }
}

Then tuple, map, and at the end the root itself.

  • You can store segments inside the parent PutExtendedAccess.
  • Triplets could use an ID instead of *[]byte.
  • Triplet storage could be a tree, which also allows concatenation in BFS order.
  • In the string case above, if it is 14 KB it should be split into 2-3 segments: the 1st is just an empty placeholder like above, the 2nd segment holds (8192-10) bytes, and the rest goes in the next one.
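The arithmetic in the last bullet can be checked with a short sketch. `planSegments` is a hypothetical helper; the 8192-byte segment size and the 10 bytes of overhead are taken from the "(8192-10)" in the comment, and applying the same overhead to every data segment is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// planSegments returns the payload size carried by each segment. The first
// entry is always 0: the empty placeholder segment from the comment. Every
// following segment carries at most segmentSize-overhead bytes.
func planSegments(total, segmentSize, overhead int) ([]int, error) {
	capacity := segmentSize - overhead
	if total < 0 || capacity <= 0 {
		return nil, errors.New("invalid sizes")
	}
	plan := []int{0} // placeholder segment
	for total > 0 {
		n := capacity
		if total < n {
			n = total
		}
		plan = append(plan, n)
		total -= n
	}
	return plan, nil
}

func main() {
	plan, err := planSegments(14*1024, 8192, 10)
	fmt.Println(plan, err) // [0 8182 6154] <nil>
}
```

For a 14 KB (14336-byte) string this gives three segments: the placeholder, one full segment of 8182 bytes, and a final one with the remaining 6154 bytes.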

@quickwritereader
Owner

ADR 002 was rebased onto main, and you were also added as a contributor.
ADR 001 needs to be redone gradually, starting with encoding. The current one does not fully comply with it, which is why I will keep its code inside experimental. Check the discussions.

thanks a lot.

Labels

enhancement New feature or request

2 participants